Use of ALT texts in IMGs

Introduction

On the WWW, I see a lot of confusion about the appropriate use of ALT texts in HTML. Although the finer points could be argued, I believe the general principles are more or less as I have set them out in this note. At least, I commend them to you, and look forward to reasoned discussion if you disagree.

Some readers say they prefer to see my collection of howlers first.

[joke alert] Lynx users are the crème de la crème!

Principles

The first principle of HTML authoring, as far as I am concerned, is to convey information to the reader about some "topic of discourse". That's a high-falutin' way of saying that one is writing a story, listing product information, offering scientific results, giving a tutorial on basket-weaving, recipes for cooking wild mushrooms, or whatever the "topic of discourse" happens to be. I'm not considering the special case when one is writing a document about HTML or about the WWW - that just confuses the issue.

Reference to the mechanics of the World Wide Web or of a particular browser is almost always an unwelcome distraction. Authors should (with rare and specific exceptions) assume that users are already familiar with the operation of their browser, and are hungry for some information about the topic of discourse. There are differences between browsers, and trying to tell the user how to use the browser with which you, the author, are familiar, is very likely to confuse them if they are using a different browser/version/platform. [1]

This note is about the kind of WWW document that is primarily textual, whose images, where used, are an adjunct to the textual information. In other words, the document is inherently accessible to text-only readers, and it's the author's job to make sure that such readers get the best that they can out of the document, in spite of the absence of images. I don't cover the kind of WWW document whose information is, for whatever reason, genuinely dependent on images, since that is not going to be accessible to text-only readers anyway.

This note is not asking you to "dumb-down" your documents in order to make them work on text-mode browsers. What it is doing is asking you to give thought to how your documents will come across in a wide range of browsing situations, and to follow an authoring style that will make best use of whatever combination of resources each different reader has at their disposal. That is the big difference between the concept of the WWW and most of the earlier concepts for making information available on the 'net.

HTML mechanisms for offering images

When you want to offer the user a picture (graphic, visual, call it what you will), HTML offers two ways of going about it: an inline IMG, or a regular link (anchor) that points to an image. You may decide to use one or other, or both, of these. (Those are the only mechanisms in HTML 2.0 and 3.2: there will be improved mechanisms in future HTML versions, e.g the OBJECT tag.) If you are offering a link to an image, then your options are to include, within the scope of that link anchor, either an IMG or some normal text, or both. So there is a wide choice of combinations, each of which could be applicable in particular circumstances. [2]
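
As a rough sketch of these combinations (the file names are hypothetical, chosen only for illustration), the markup might look like this: first an inline IMG on its own, then a plain text link to an image, then a link whose anchor contains both an IMG and normal text.

<IMG SRC="cannon.gif" ALT="Warships at that time usually had two rows of cannon">
<A HREF="frigate.jpg">Frigate, circa 1800</A>
<A HREF="frigate.jpg"><IMG SRC="frigate-thumb.gif" ALT=""> Frigate, circa 1800</A>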

The ALT text itself may not contain HTML markup; on the other hand, the IMG (and thus its ALT text) is governed by any tags that enclose it. So, for example, if your IMG is a heading, or forms part of a heading, be sure to enclose it between the <Hn>...</Hn> tags. It's clear from the HTML 2.0 spec that the alt text is intended to support the &entityname; and &#number; mechanisms, but there were some browsers that didn't even honour that, and I guess there may still be people using them. So, be conservative about what you put into alt texts. Hint: some browsers display badly when the input HTML has a line break within an alt text, so it is advisable to keep the whole alt text string on a single input line. If necessary, start on a fresh input line before the ALT= attribute, and end the input line after the closing quotation mark, but don't break the input line anywhere between. As far as I know, all browsers will flow the displayed text string onto two or more lines where necessary, so a long alt text is not a problem in itself.
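
A minimal sketch of these points, assuming hypothetical file names and wording: the first IMG takes its heading status from the enclosing H2 and uses an entity in its alt text; the second breaks the input line before ALT= and after the closing quotation mark, but nowhere in between.

<H2><IMG SRC="menu.gif" ALT="Caf&eacute; menu"></H2>

<IMG SRC="terrace.gif"
ALT="The caf&eacute; terrace, looking out over the harbour">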

Why should authors bother with ALT texts?

Well, from the fact that you're reading this article, I hope you already think it's a good idea, but I have written some notes [3] on this topic.

Appropriate choice of ALT texts

I recommend Henry Churchyard's small page on which he quotes some usenet discussions about alt texts.

I mentioned the alt text problem in an email to Callie at Writepage, and received by return a detailed response that deserves to be published as a WWW page in its own right. Here is just a short extract. As Callie says:

Many authors haven't figured out exactly what they are trying to present; they don't know what it is about the image that's important to the page's intended audience. The reason you can't figure out why their alt tags aren't working is that they don't know why the images are there. Every graphic has a reason for being on that page: because it either enhances the theme/ mood/ atmosphere or it is critical to what the page is trying to explain. Knowing what the image is for ... makes the labels easier to write.

When you write predominantly-textual material in HTML, you address three different kinds of user:

I. Those with image loading enabled. Example: any graphical browser.
II. Those browsing in text mode, but having image display available if they so choose. Examples: graphical browser with auto image loading off; Lynx with a graphical viewer available as helper application.
III. Those who have text mode only, and cannot display images at all. Examples: character mode terminal; readers who use a speaking machine.

The speaking machine isn't only for blind readers (although why people would want to put additional difficulties in the way of blind people accessing their textual material, I can't imagine). A sighted reader might use a speaking machine while driving along, for example, and there is even a service, the "Web-on-Call(TM) Voice Browser", for reading out textual WWW pages over the telephone. Users with physical disabilities may also find it difficult or impossible to use the interface of a normal graphical browser.

When you use an inline image, the ALT text is your tool - not a very precise tool, but a serviceable tool nevertheless - to get your message over to readers of types II and III. There are several different reasons why you might be making images available to your reader, so, not surprisingly, there are several different approaches to choosing an alt text. I found it helpful to categorise four main types of image. These are not meant to be in any order of priority: each use has its proper field of applicability.

a) "Page toys" (Callie's term: I call them "decorations").

If your reader doesn't display these, then there is nothing that an alt text can usefully add to the topic of discourse, and there is little likelihood that a normal reader who is running in text mode will want to view or download such a decoration. So, code ALT="" in most cases. In the event that you are using one as a link anchor, be sure to include some text in the scope of the anchor too (it is always good authoring style to make the significant text be the link, rather than some insignificant bullet or, so help me, "click here"). Cautionary or interrogatory icons might be replaced by something like "[!]" or "[?]", and bullets with ALT="*" etc.
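
By way of illustration only (file names and surrounding text are invented for the example), decorations of this kind might be coded as follows: a pure decoration, a decoration sharing a link anchor with real text, a cautionary icon and a bullet.

<IMG SRC="swirl.gif" ALT="">
<A HREF="contents.html"><IMG SRC="ball.gif" ALT="">Table of contents</A>
<IMG SRC="warning.gif" ALT="[!]"> Never eat a mushroom that you cannot identify.
<IMG SRC="dot.gif" ALT="*"> Chanterelles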

As far as company logos are concerned, well, if the name of the company is already on the page in clear text (as is often the case), then the logo can be treated as decoration, and ALT="" is the appropriate choice; if the logo were being used instead of the company name, then ALT="Foo Corporation" would be the correct choice. (If you are a vendor of logos, exhibiting specimens of your wares, then of course different considerations may apply!).

The often-seen variations on ALT="Company Logo", ALT="Logo of Foo Corporation", or even ALT="Medium size GIF of logo" (!) are incomprehensible: the author is supposed to be providing the reader with information, not with meta-information (description of information). For the "type III" reader, references to pictures that they cannot see are useless and frustrating.
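
A sketch of the two logo cases discussed above, assuming a hypothetical logo.gif. In the first line the logo stands in place of the company name, so the alt text is the name; in the second the name is already there in clear text, so the logo is mere decoration.

<H1><IMG SRC="logo.gif" ALT="Foo Corporation"></H1>
<H1><IMG SRC="logo.gif" ALT=""> Foo Corporation</H1>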

b) "Navigation Icons"

These are, of course, links to other documents. The text equivalent is usually simple enough, being a short description of the target ("Adventure", "Science Fiction") or function ("ToC", "Next Chapter", "Previous", "Foo Corporation Home Page"). Please bear in mind that browsers already have their own meaning for terms such as "Back", "Home" and maybe "Forward", so it is best to avoid the potential confusion that results when the author makes these terms mean something else. Please avoid those "Return to xxx" links that are so irritating to someone who went directly to your page, and has never visited xxx before. I prefer just "xxx", but you can put "Go to xxx" if you feel that you must. (Each browser has its way of showing the reader that a text is a link - often showing it underlined or boxed, and/or in a distinctive colour, and/or flashing up a URL in a status area, so the "Go to" is redundant IMHO.) Another tip: real HTML authors don't write "Click here", and experienced WWW readers feel themselves "talked-down-to" when they see this phrase. If the icons are in fact thumbnails for "navigating" to a fullsize image of the same thing, then see later discussion under (c).
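
For instance, a row of navigation icons might be marked up along these lines. This is only a sketch: the file names and targets are hypothetical, and the brackets are just one possible way of keeping adjacent alt texts apart.

<A HREF="toc.html"><IMG SRC="toc.gif" ALT="[ToC]"></A>
<A HREF="ch2.html"><IMG SRC="prev.gif" ALT="[Previous]"></A>
<A HREF="ch4.html"><IMG SRC="next.gif" ALT="[Next Chapter]"></A>
<A HREF="/"><IMG SRC="foo.gif" ALT="[Foo Corporation Home Page]"></A>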

Imagemaps are a special case of this category. In some situations, imagemaps play a role that cannot directly be substituted by anything else (geographical maps, for example), but you should still be offering some other means of navigation for text-mode readers, such as text-mode links, an A-Z index, a search engine etc. Often, though, on the WWW, one sees imagemaps used instead of simple text or IMG links, presumably on the grounds that it's more complex and so demonstrates the author's prowess. This is a pity, if the author doesn't also have the prowess to make their page usable for all text-mode readers. At the least, they could supply a row of text-based links. But when you've provided text-based links, do you really need the imagemap? The imagemap (especially the server-side imagemap) has a number of disadvantages on the WWW. In many cases that I have seen, authors seem to have used them to construct a non-intuitive bran-tub (lucky dip) for confusing and frustrating their readers. A row of simple IMG with ALTs can do the job in a technically more effective manner, that adapts better to changes of window size, as well as exploiting caching in ways that imagemaps cannot. I recall with pleasure a site that had a row of six large but simple IMG navigation buttons, that happily lined themselves up as two rows of three, three rows of two and a stack of six, as I adjusted my window size; and that came equipped with the proper alt texts for use by text-based users. Most of them were re-used in that site's other pages, for which the typical browser would use them from its cache. The corresponding imagemap cannot adjust to the user's choice of window size, and if similar - but different - images are used as navigational imagemaps on different pages, then every one has to be loaded separately.

More and more browser/versions are supporting (Spyglass type) client-side imagemaps. If, in addition, the image used for a site's navigation imagemaps is the same on every page, then there's a larger range of situations where you might consider using a dual client-side/server-side imagemap for navigation. But still, don't overlook the simple text-based solution. You can, as Henry Churchyard recommends, hide an imagemap from Lynx by using ALT="" if there are other adequate text mode navigation tools present; but the next paragraph shows that it's not the only approach.

Users of Lynx (except for very old releases) can "use" a server-side imagemap, provided that the 0,0 co-ordinates do something useful. It's frustrating if that produces a response such as "You clicked in the wrong place, please try again". Instead of intimidating the user, the author should have provided an alternative means of navigation, or, at the very least, have allowed 0,0 to go somewhere useful - for example, to a text navigation page. If you have decided that an imagemap is the appropriate technique, then a good solution all round is to provide a dual client-side/server-side map, taking care to supply ALT attributes on the AREA tags of the client-side map: then, recent versions of Lynx will be able to use the client-side map, while old-ish versions can still take the "0,0" option of the server-side map, the importance of which has just been mentioned. Did I say you should provide separate graphical and text-only pages? - I did not: in general I don't believe you should, and those people who keep yelling "I can't afford the time to make text mode versions of my pages" are just looking for any inane excuse to make their site inaccessible to text mode users. A page with text mode navigation on it (such as an index or table of contents, or a search engine) can be just as useful to a graphical browser as it is to a text-mode browser; you shouldn't devalue it as an extra chore for text-only users.
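
A dual client-side/server-side map of the kind described above could be sketched like this. The map name, co-ordinates, target pages and the server-side handler URL are all assumptions made for the example; the handler in particular depends entirely on how your server is set up.

<A HREF="/cgi-bin/imagemap/navbar.map"><IMG SRC="navbar.gif" USEMAP="#navbar" ISMAP ALT="Site navigation"></A>
<MAP NAME="navbar">
<AREA SHAPE="rect" COORDS="0,0,99,29" HREF="univ.html" ALT="The University">
<AREA SHAPE="rect" COORDS="100,0,199,29" HREF="town.html" ALT="The Town">
</MAP>

A browser that understands USEMAP works from the AREA tags (and, in text mode, from their ALT texts); one that does not falls back to the ISMAP link, so the server-side map should still send co-ordinates 0,0 somewhere useful.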

c) "Supplemental or Interesting"

These are graphics that the user may find help their appreciation of the text, but are not mandatory to it. I suggest that there are two ways of going about this.

i) Provide only links to them; show the reader what's on offer with a brief description. Now, the principle says not to fuss about details of the WWW, but in this case, what with limited bandwidth and the possibility of not all image formats being accepted by all browsers, we can make an exception and warn the reader what it is that we are offering here. So an example could be:

<a href="...">Frigate, circa 1800, 160kB p-JPEG</a>

ii) Provide an inline IMG, with an ALT text that summarises the major feature that you wanted to bring to the reader's attention, e.g

ALT="Warships at that time usually had two rows of cannon"

You should be able to word the body of your text so that it doesn't pre-suppose the reader is also viewing the image alongside. As long as readers of "type II" are aware that an image is available, they can make their own decision whether to load it. Giving readers of "type III" the impression that you are commanding them to load an image, will only frustrate and annoy them.

Of course, you could combine an inline IMG, with its ALT text, together with a link to an out of line image (either the same image, or a larger, more detailed version of it). In this case, take care to put the information in its proper place. If your inline is just a thumbnail, then both text-only and graphical users will be interested in what is effectively a caption for the (out of line) image, so put this text, e.g "Frigate, circa 1800, 160kB p-JPEG" as ordinary text inside the anchor, just as it was shown above, rather than putting it as alt text. (If, on the other hand, there is just the one image, available both as an inline and via a link, then someone who has already loaded the image in-line won't need to be told its vital statistics, so it would seem appropriate to put the "160kB, p-JPEG" into the alt text. But these are minor details, compared with the kind of alt texts that I am criticising in this article.)

If you feel that the IMG needs some additional alt text, provide it; if not, then put ALT="" as usual. This alt text could typically supply the chief piece of information for which you had provided the picture, e.g following the above example, you might describe the major relevant feature of the vessel illustrated by the picture. The test of appropriateness, as ever, is to imagine the HTML document viewed without the picture, and ask yourself whether the ALT text supplies useful information about the field of discourse. (Technical information about the missing images is, as I say, tolerated if it's there for good reason, but in an article on historic ships remember that the reader primarily wants information about ships, not technical woffle about the WWW.)
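
The thumbnail-with-caption arrangement discussed above might be sketched as follows, with hypothetical file names. The caption sits inside the anchor as ordinary text, where both text-only and graphical readers will see it, and the thumbnail itself needs no alt text.

<A HREF="frigate.jpg"><IMG SRC="frigate-thumb.gif" ALT=""> Frigate, circa 1800, 160kB p-JPEG</A>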

d) Critical for understanding the page

In some fields (Callie's, for instance) this situation rarely arises. In others (science and engineering, and mathematics so long as we have to put equations in as inline images), it's quite a common occurrence. In this case, provided you have somehow made the reader aware that the document will unfortunately be meaningless without them loading the inline images, it hardly matters what you put in the ALT text. If you have a mixture of optional graphics with a few mandatory ones, maybe you could consider using the ALT text of the mandatory graphics to inform the user that they need to load this particular image for proper understanding; most browsers support selective loading of a few desired inline graphics, even where the reader is unwilling to load a heap of decoration over a slow network.
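
One possible wording, offered only as a sketch (the file name and text are invented): the alt text of a mandatory graphic tells the reader plainly that this image, unlike the decorations around it, is worth loading.

<IMG SRC="decay-scheme.gif" ALT="[Decay scheme diagram: please load this image, the discussion below depends on it]">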

Further discussion and thoughts

I don't think cases (a) and (b) really need a great deal of discussion. (c) and (d) are trickier, and it's not always obvious which of the two we're dealing with. One reader's essential illustrations are another reader's optional extras, and in the more borderline cases it's possible to make an image seem to be either essential or supplemental depending on just how you word the text (as Toby Speight pointed out).

In a situation where there is some agreed scheme for a textual representation of the image, then you could use that as the ALT text. If your audience is accustomed to reading mathematical equations in LaTeX notation, you could use that as alt text for the image of the equation. If you are dealing with heraldry, then you might use the appropriate heraldic description or "blazon", Three Seaxes Argent in pale on a field Vert. In fairness, there are cases where a graphic is absolutely essential to the meaning, and no reasonable amount of text can possibly replace it. But this is no excuse for providing useless alt texts in those situations where a useful one could be provided.
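
As a sketch of both ideas, with hypothetical file names and an equation picked purely for illustration:

<IMG SRC="eq1.gif" ALT="$E = mc^2$">
<IMG SRC="arms.gif" ALT="Three Seaxes Argent in pale on a field Vert">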

We had an example a little while back in which someone had suggested (in the context of holiday offers) an alt text that said ALT="Picture of Hotel". I say this is inappropriate because it tells us nothing about the subject of discourse - instead, it tells us chiefly about the mechanics of the WWW. What the reader, particularly a reader of "type III", wants to know is - what does the picture show that's relevant to the topic of discourse? The picture might show, for example: ALT="The Pines Hotel, a fine old stone building in extensive grounds". This is suggested as an alt text, rather than as a caption, because those readers who can see the picture will already be able to see it for themselves. If you want to also offer them a link to the picture, then do so, in one of the ways mentioned above. (Even a blind reader might want to download the picture so that they can show it to a friend later.)

Readers of "type II" already have browser facilities that allow them to retrieve the image if they so choose (yes, even those who use Lynx, especially recent versions, that put the inlines only a keystroke away); there is nothing extra that the author really needs to do about it - and after several discussions of what the author could do, nobody seems to have come up with anything that really provides worthwhile additional help to the text-only user without needlessly distracting the graphics-based user. (As so often on the WWW, the important thing is to mark up the information honestly for what it is, rather than trying to out-guess the browser designer; if there are limitations in the facilities that one or other browser offers, then the place to remedy them is in the browser design, not in the author's HTML source.) When I mentioned in a usenet discussion my dissatisfaction with the above text, "Picture of Hotel", someone helpfully suggested "Download picture of Hotel". I hope that by now I've made it clear why, far from being an improvement, this seems to me to be even worse.

One suggestion was to capitalise on Lynx's "[IMAGE]" notation by putting ALT="[Picture]" or ALT="[Picture of Hotel]", so that Lynx users would associate this with an image. That seems a good idea, but the particular examples still convey no information to "type III" readers. Isn't this better? - ALT="[The Pines Hotel, a fine old stone building...]".

As I said before: if the image is mandatory to your presentation, then say so plainly. If not, then don't pester the reader to load it: indicate to them what information it contains that's relevant to the "topic of discourse", and leave them to take whatever action they consider appropriate.

As for pure decorations, your readers don't want to see [IMAGE] sprinkled around! So be sure to code an explicit ALT="" for your decorative images, and provide something suitable for bullets and rules. [How sad that the browser makers refused to implement the SRC="..." attribute to UL, LI and HR, as set out in the now-expired HTML3.0 draft. But we must make do with what we've got, for now.] Please, do not use character code points that are undefined in the ISO-8859-1 code but just happen to produce a bullet on your particular platform: these may produce anything, or nothing, on other platforms.

"This page has been visited [Counter Image] times"
"Accessed [a bitmapped number] times since 12/1/95"
(and several variations on this theme).

Well, this has nothing to do with the "topic of discourse". I don't think any of the variations could be claimed to be good style (not forgetting that 12/1/95 means something different to European readers than what it means to USAns). But I can't work up any enthusiasm for page counters, so I'm not going to suggest a better way of doing it. If you're a page counter fan, I'm sure you'll work something out with a bit of Server Side Includery or CGI. What I am going to suggest is that you take a look at a tongue in cheek guide to setting up your Web Site.

Spacing between alt texts

(Thanks to Toby Speight for prompting me to address this issue.)

Consider some images crammed together, for example as navigation buttons:

<IMG SRC=univ.gif ALT="The University"><IMG SRC=town.gif ALT="The Town">...

When viewed on a text-mode browser, this is going to read:

The UniversityThe Town...

So, take care to put spaces in an appropriate place, or make some other provision for spacing the texts. Bracketing e.g ALT="[The University]" is a popular choice: vertical bars are also sometimes used.
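
With brackets added, the same pair of buttons might be written as below (a sketch only; any separator that reads sensibly would do):

<IMG SRC=univ.gif ALT="[The University]"><IMG SRC=town.gif ALT="[The Town]">...

which a text-mode browser would show as:

[The University][The Town]...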

HEIGHT and WIDTH?

There has been a long-running debate on *.html about the wisdom of providing correct HEIGHT and WIDTH attributes on IMG tags. The idea is that the browser can reserve space for the image before the image is retrieved, and in consequence can present the normal text properly formatted on the page as soon as the text is available, simply slotting the images into place later as they arrive.

Browser versions differ in how they support this when image loading is turned off. Some (and this is true of e.g recent Netscape versions) will display the ALT text only partially, or not at all, if the specified rectangle is too small. The behaviour while waiting for image loading to be completed (some browsers display the ALT text during this interval) may or may not be the same as the behaviour when image loading is turned off. Too many variations of behaviour have been seen, between browsers and between versions, to be able to give an account of them here. On balance I think the only advice I can offer is to normally include correct HEIGHT and WIDTH information; but if these are likely to be too small for the text, then you might decide to deliberately omit the h/w attributes. Finally, don't include incorrect HEIGHT and WIDTH information: some browsers may be meant to re-size the image to fit, but the results are typically not good.
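
To illustrate that advice (the file names and pixel sizes here are assumptions, not measurements of any real image): a large picture gets its true dimensions, while a small button whose alt text would not fit into its rectangle is deliberately left without them.

<IMG SRC="hotel.jpg" WIDTH=320 HEIGHT=200 ALT="The Pines Hotel, a fine old stone building in extensive grounds">
<IMG SRC="next.gif" ALT="[Next Chapter]">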

INPUT TYPE=IMAGE

(Now moved to a separate page.)

A reader comment

A correspondent suggests that many of the abuses and howlers described here are being caused by the "authoring tools" which authors are using to create their WWW documents. That isn't much of an excuse, is it? HTML markup is simple enough already: certainly it makes sense to use an appropriate software tool if it genuinely saves drudgery (creating TABLEs, for example, or automatically inserting the correct image height and width in the situations where you want it). But when the tool prevents you from producing documents that conform to good authoring style, then you should be questioning your original decision to rely on that tool.

A few howlers

All of these examples represent things I have seen on the WWW.

Reality check - typical scenarios in text mode:

                Click on your choice now:
                        [LINK]
                        [LINK]

or this, as seen on a unix system for which their recommended browser is not in fact available:

    Our site is best experienced with: [LINK]Click to Get It!

then, there was this great piece of advertising...

        Another fine Web sight from [Company Logo]

or, to quote from an information site of a major corporation:

                spacer image

                [INLINE]

                spacer image

But you wouldn't do that to your readers, would you? (;-)

"This site is [LINK]etscape [INLINE]"

To see some samples ranging from the slightly silly to the mind blowingly stupid, take a few minutes to browse the results of an AltaVista search for the string "etscape", and look at the various HTML sources - quite instructive!

ALT="Large Yellow Bullet"

Very clever. So we get to read (or blind readers get to hear):

    Large Yellow Bullet Introduction
    Large Yellow Bullet The Problem
          Small Red Bullet Historical Analysis
          Small Red Bullet Current Situation
    Large Yellow Bullet The Solution

(Yes, I have genuinely seen this kind of thing on the WWW, I am not making this up. What were these authors thinking of?)

ALT="This image is mapped, please download it"

For Lynx users, this does not help. I've already made some more-appropriate suggestions above. See my accompanying document for more detail and references.

ALT="Imagemap of various flags"

Little better than the previous one. What would a text-only user want with an image of some unspecified flags? This is supposed to be a navigation tool, not a guessing game!

Alt="(Sorry, Not Available With Your Web Client)"

Nonsense! I was using Netscape with image loading turned off. Even if I had been using Lynx, who are you to say that I can't fire up a helper application to see this image?

ALT="Put your alt text here"

¿Qué?

<IMG SRC=left.gif> <IMG SRC=up.gif> <IMG SRC=right.gif>...<BR> [Previous] [Up] [Next]... (both rows used an identical sequence of anchors, which I omitted here for brevity).

Hmmm. Graphics users get both a self-explanatory row of image links and a text-mode row that does the same thing. If they have image loading turned off, they get a puzzling row of missing-image icons, followed by a meaningful text-mode row. Text mode users get a puzzling row of [LINK] [LINK] [LINK]... followed by a useful text-mode row. If the author couldn't handle images better than this, they should have just used the text-mode bar, and left the images alone!!
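
If the images are wanted at all, one way of repairing this (a sketch, with the anchors' targets left elided as in the original) is a single row in which each image carries its own alt text:

<A HREF="..."><IMG SRC=left.gif ALT="[Previous]"></A> <A HREF="..."><IMG SRC=up.gif ALT="[Up]"></A> <A HREF="..."><IMG SRC=right.gif ALT="[Next]"></A>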

[THUMBNAIL] [THUMBNAIL] [THUMBNAIL] [THUMBNAIL] [THUMBNAIL]

A heroically misguided attempt to comply with the authoring advice to "be sure to provide an ALT attribute on every IMG". Sorry.

"Photo of a bull in the water canoeing"

I beg your pardon? Ah, here's what went wrong:

<IMG SRC="bull.jpg" ALT="Photo of a bull in the water"> <IMG SRC="canoe.jpg" ALT="canoeing">

"Academic departments are indicated by pink bullets".

Not very useful to those browsing in text mode. As a first improvement, replace that with e.g:

Academic departments are indicated thus: <IMG SRC="pinkbullet.gif" ALT="[*]">

(making the corresponding adjustment to the list itself, of course) and the result will make sense for both kinds of reader. However, we still haven't addressed the monochrome display. Better would be to use a distinctive shape (in this case e.g a mortarboard) that would still be meaningful in monochrome, instead of trying to rely on colour distinctions; together with, of course, distinctive ALT markers for the various kinds of department.
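
A sketch of that further improvement, in which the icon files, marker strings and department names are all hypothetical:

Academic departments are indicated thus: <IMG SRC="mortarboard.gif" ALT="[Acad]"> and service departments thus: <IMG SRC="spanner.gif" ALT="[Serv]">

<IMG SRC="mortarboard.gif" ALT="[Acad]"> Department of Physics
<IMG SRC="spanner.gif" ALT="[Serv]"> Computing Service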

A careful Web author will separate their content clearly from any details of its presentation. This was a case of "leakage", where the author had made the mistake of referring from the text to some aspect ("pink bullets") of the presentation that would only be perceived by a subset of readers.

"Click on the green text"

The author had turned a few dozen bytes of text into an image, which, as it happened, was green. They had then remembered to supply the same text in the ALT attribute, but had apparently forgotten (1) that Lynx users don't actually "click on the text" and (2) the ALT text wasn't green, indeed, the author could have no idea what colour it was. Apart from those two points, and the fact that readers who are on slow networks do not find it any particular benefit that a dozen bytes of text has been turned into several kilobytes of image, the author had done a fine job ;-). (Another case of "leakage", I'd say, on top of the highly infectious "click here" disease.)

My favourite howler went something like this:

<CENTER>
<FONT SIZE=6>Our Classrooms and Staff</FONT>
<IMG SRC="rule.gif" ALT="fancy horizontal rule">
</CENTER>

Note that instead of using <H1> as was appropriate for this first level header, they had simply marked it up with a Netscapist font size. So, there was no linebreak of any kind implied between the text and the image. Now, when displayed with image loading on, this was not a problem: the "fancy horizontal rule" was so big that it automatically went onto a new line. However, with Lynx this whole thing was quaintly rendered as

Our Classrooms and Staff fancy horizontal rule

Apart from using H1 around the title text, as was clearly appropriate on that particular page, my solution would be something like this:

<P ALIGN=CENTER><IMG SRC="rule.gif" ALT=". o O o ."></P>

if you really feel that a decoration is needed here (choose your own decorative text string ad lib).

(Comment: you may be able to find some of the above howlers with search engines such as AltaVista; others have been adapted by simplifying the original, or taking two or more similar examples and composing one that represents them all.)

I don't believe that any particular familiarity with a text browser, nor indeed with a speaking machine, was required in order to get those examples right. If the document had been marked up properly, by following the guidelines given at the W3C and giving appropriate thought to the content that is to be communicated to the reader, rather than getting side-tracked by the mechanics of the WWW, then it would have "worked" on every browser - and searcher and indexer.

To sum up:

Think what information will be presented to each of the three types of user
Consider how that information illuminates the "topic of discourse"
And then all should be clear.

A final parting shot. When your paperback edition is published, does it include an "ALT" text that tells the reader what a cheapskate they are, and how they should have bought the hardback edition with the eight extra illustrations, and the handsome dustcover? I think not. Please don't address your text-mode readers [3] as if they were second class citizens, either.

My thanks and best regards to all who have contributed to the discussions on earlier drafts of this note. Special thanks to Henry Churchyard for contributing the "Enhanced for Lynx" icon.

The contents of this article were originally published at http://ppewww.ph.gla.ac.uk/%7Eflavell/alt/alt-text.html, where they are currently maintained.
